Alpha-1-antitrypsin-deficiency is associated with lower cardiovascular risk: an approach based on federated learning

Background Chronic obstructive pulmonary disease (COPD) is an inflammatory multisystemic disease caused by environmental exposures and/or genetic factors. Inherited alpha-1-antitrypsin deficiency (AATD) is one of the best recognized genetic factors increasing the risk for an early onset COPD with emphysema. The aim of this study was to gain a better understanding of the associations between comorbidities and specific biomarkers in COPD patients with and without AATD to enable future investigations aimed, for example, at identifying risk factors or improving care. Methods We focused on cardiovascular comorbidities, blood high sensitivity troponin (hs-troponin) and lipid profiles in COPD patients with and without AATD. We used clinical data from six German University Medical Centres of the MIRACUM (Medical Informatics Initiative in Research and Medicine) consortium. The codes for the international classification of diseases (ICD) were used for COPD as a main diagnosis and for comorbidities and blood laboratory data were obtained. Data analyses were based on the DataSHIELD framework. Results Out of 112,852 visits complete information was available for 43,057 COPD patients. According to our findings, 746 patients with AATD (1.73%) showed significantly lower total blood cholesterol levels and less cardiovascular comorbidities than non-AATD COPD patients. Moreover, after adjusting for the confounder factors, such as age, gender, and nicotine abuse, we confirmed that hs-troponin is a suitable predictor of overall mortality in COPD patients. The comorbidities associated with AATD in the current study differ from other studies, which may reflect geographic and population-based differences as well as the heterogeneous characteristics of AATD. Conclusion The concept of MIRACUM is suitable for the analysis of a large healthcare database. This study provided evidence that COPD patients with AATD have a lower cardiovascular risk and revealed that hs-troponin is a predictor for hospital mortality in individuals with COPD. Supplementary Information The online version contains supplementary material available at 10.1186/s12931-023-02607-y.


Background
Chronic obstructive pulmonary disease (COPD) is an inflammatory and multifactorial disease [1], according to the World Health Organization data, it is the third leading cause of mortality worldwide [2].The COPD phenotype results from the complex interplay of environmental, health and genetic risk factors [1].A typical COPD patient often suffers from one or more comorbidities, specifically cardiovascular diseases (CVDs).There are many similarities between COPD and CVD and there are common risk factors, the most important of which is cigarette smoking.Persistent systemic inflammation, increased senescence, changes in the microbiome, and/or a genetic predisposition are determinants for coexistent comorbidities, such as CVDs but also osteoporosis, intellectual deterioration, muscle hypotrophy and cachexia [1,3,4].Therefore, one of the goals of COPD research is to recognize disease phenotypes and to develop biomarkers that enable clinicians to identify individuals with increased risk for comorbidities and mortality, and those in need of a more intensive personalized treatment.Inherited Alpha-1-Antitrypsin-Deficiency (AATD) is a genetic predisposition to COPD and pulmonary emphysema that affects 0.01-0.02% of the population in Europe and that makes the lungs more vulnerable to inhaled harmful particles [5][6][7].The most common AATD genotypes are Pi*Z (Glu342Lys) and Pi*S (Glu264Val) resulting from a point mutation in the SERPINA 1 gene encoding the alpha-1antitrypsin (AAT) protein, a broadspectrum protease inhibitor and immunomodulatory protein [8][9][10][11].The Pi*Z mutation in the SERPINA1 gene leads to AAT polymerization, intracellular accumulation, and low circulating levels of AAT [12][13][14][15].
As mentioned above, COPD patients have a high prevalence of cardiovascular risk factors and comorbidities [17].However, knowledge about the prevalence of CVDs in patients with AATD-related emphysema remains limited and controversial.Greulich et al. [16] evaluated health insurance data, unadjusted for important confounders such as smoking status, age, gender and body mass index (BMI), and showed a significantly lower cardiovascular risk and comorbidity profile in COPD patients with AATD compared to those without AATD.
In a large prospective German multicentre cohort study (COSYCONET), the authors considered smoking, age, gender, and BMI as confounding factors of COPD and investigated comorbidities in stable COPD patients with and without AATD [17].Here, in line with the previous study, the authors reported that patients with AATDrelated COPD have a significantly lower prevalence for cardiovascular comorbidities.In contrast, a recent study from US insurance databases did not identify significant lower prevalence of cardiovascular comorbidities associated with AATD-related COPD [18].Therefore, the aim of this study was to investigate in non-AATD and AATD COPD patients: (i) comorbidity profile (ii) high sensitivity troponin (hs-troponin) and (iii) blood glucose and lipid profiles.For this purpose, we analysed data in a real-life setting obtained from six medical centres of the Medical Informatics in Research and Care in University Medicine (MIRACUM) consortium [19].

Study population
We conducted a longitudinal study, which included patients diagnosed with COPD (primary or secondary diagnosis) during their hospitalization at one of the university hospital centres (Erlangen, Frankfurt, Freiburg, Giessen, Mainz and Marburg) pseudonymised as sites A to F. The data acquisition periods vary depending on the clinical centre, however, in each case the starting year has at least one AATD patient enrolled (for details see Supplementary).The university hospital centres belong to the MIRACUM consortium uniting ten university hospitals, two universities of applied sciences and one industrial partner spread over seven German states.The aim of the MIRACUM consortium is to make clinical data usable for research projects via modular, scalable and federated data integration centres [19][20][21].COPD patients were diagnosed based on a diagnosis coded as J44 or according to the International Classification of Diseases revision 10 (ICD-10) [21].In total, we were able to extract 112,852 visit1 information of 43,057 COPD patients with an average visit number of 2.19 per individual.Patients having AATD were defined based on a diagnosis code of E88.0 of the ICD-10 [22].

Data extraction and handling
Routine data were collected within the participating hospitals' routine care IT systems for all individuals under the local hospital laws.The data were retrieved from the database using the SQL language, pseudonymised and then imported into a standardized i2b2 database schema [23].Additional data wrangling transformations were performed using the statistical software R [24].In particular, continuous measurements that can be collected multiple times per visit were averaged across one visit to obtain one value per visit.Due to data protection / IT security constraints, an adapted DataSHIELD (Data Aggregation Through Anonymous Summary-statistics from Harmonised Individual levEL Databases) implementation was used for the analyses [25,26], where the sensitive data remain within each hospital centre and only anonymous aggregated data are shared.Using specific statistical methods, one can still obtain the same analyses results with this framework as with locally pooled data for specific statistical models [27].Each participating site uploaded selected cohort data from i2b2 database into its local DataSHIELD Open Policy Agent Layer (OPAL) server and access was granted to the analyst at each of the participating sites.This approach enabled the anonymous data analysis in line with the EU General Data Protection Regulation [28].All six local ethics committees approved the study protocol.

Measurements
For each visit of a COPD patient, the covariate information was extracted.Demographic characteristics included age at admission in years, gender and nicotine abuse indicated with an ICD-10 diagnosis F17.Considering that BMI is a key confounding factor to take into account in the analysis of AATD, an attempt was made to extract this data.However, the data had to be discarded subsequently due to an excessive number of missing values.All comorbidities were recorded as binary variables indicating the presence or absence of the comorbidity, and we assumed that a missing coded comorbidity indicated the absence of that comorbidity.Cardiovascular comorbidities included acute ischemic stroke, angina pectoris, arrhythmias, atrial fibrillation, carotid stenosis, coronary heart diseases, coronary sclerosis, heart failure, hypertension, ischemic heart, myocardial infarction, peripheral artery diseases and valvular diseases.Other diseases considered were bronchiectasis, diabetes, emphysema, liver diseases and renal failure.In addition, numerous laboratory parameters were retrieved.Laboratory biomarkers included triglycerides, Low Density Lipoprotein (LDL), High Density Lipoprotein (HDL), cholesterol, transaminases; Alanine Aminotransferase (ALAT or GPT) and Aspartate Aminotransferase (ASAT or GOT), glycosylated haemoglobin (HbA1c), platelets, haemoglobin and blood glucose.A high-sensitivity troponin was also included as a heart function biomarker.Neutrophils, lymphocytes, procalcitonin and C-reactive protein (CRP) were used as biomarkers related to systemic inflammation.Only complete cases were included in the analysis.A summary of the number of missing data per laboratory parameter can be found in Supplementary S5.Data extraction summaries based on ICD-10 codes for comorbidities and LOINC (Logical Observation Identifiers Names and Codes) codes for laboratory parameters are presented in Supplementary S2.

Statistical analysis
The statistical analysis was performed within the DataSHIELD framework.Descriptive statistics were presented as means and standard deviations (SD).Frequencies and percentages were used to report distributions of categorical variables.The χ 2 -test was used to compare the prevalence of the different comorbidities between the two COPD groups (AATD versus non-AATD).The mean laboratory blood glucose and lipid profiles in both groups were compared using Student's t-test.Generalized linear mixed effects regression models with a random effect per patient to account for multiple visits per patient and adjustments (age, gender and nicotine abuse) were used to estimate the effects of AATD for COPD patients on the odds of having a specific comorbidity.Similarly, inhospital mortality (binary coded as discharged alive or not) associated with the biomarker hs-troponin was predicted in both groups.Specifically, we used a logistic regression model (family: binomial, link: logit).Analogously, linear mixed effects models including a random effect per patient and adjustments were used to estimate the impact of blood glucose and lipid biomarkers on comorbidity status (family: gaussian, link: identity).The corresponding forest plots were generated with the forester package in R [29].The statistical analyses were performed using DataSHIELD version 6.1 [27] and R version 3.5.2[24].The reported probabilities values are two-sided and a threshold p-value of 0.05 was used to infer exploratory statistical significance.

Patient characteristics
Table 1 shows the characteristics of all patients included in this study (see Supplementary S6 for site-specific details).In total, 43,057 COPD patients were enrolled, of whom 42,311 (98.27%) had no reported diagnosis for AATD and 746 (1.73%) had AATD.This COPD cohort included more males than females (61.67% vs. 38.33%),which was also true for non-AATD and AATD COPD sub-cohorts.Approximately 12% of all COPD patients were diagnosed as nicotine abuser compared to 8.6% of the AATD-related COPD patients.The average age at first hospital admission was 65.87 years for AATDrelated, 70.17 years for non-AATD-related and 70.13 years for all COPD patients.The average number of hospital visits was higher for AATD patients than for Non-AATD patients (see Table 1).Only 3.22% of the COPD patients were coded as having asthma (see Strengths and limitations).The biomarkers, such as neutrophils, lymphocytes, procalcitonin and C-reactive protein (CRP), showed a higher average while the average value for haemoglobin and platelets was lower in AATD-related COPD patients.The higher CRP levels could hypothetically be related to the reduced anti-inflammatory effect of AAT.
In addition, we examined the influence of the inflammation biomarkers on the biomarker hs-troponin.Table 2 shows the results of the univariate linear regression for the enitre COPD cohort.The lack of a significant impact of neutrophils, procalcitonin, and CRP on hs-troponin led to exclusion of inflammation parameters from further analyses.

Prevalence of comorbidities in AATD versus Non-AATD COPD patients
The overall comorbidities defined according to the ICD-10 codes were compared between AATD and Non-AATD patients (see Table 3).Our data revealed that AATD-related COPD patients have significantly lower cardiovascular risk profiles as compared to non-AATD COPD patients.Our results also show that hypertension is slightly less frequent among AATD compared to non-AATD patients (60.12% vs. 62.52%, p-value = 0.136).In addition, we observed a significantly lower prevalence of liver diseases (liver cirrhosis and liver fibrosis) and renal failure in AATD-related COPD as compared to non-AATD COPD patients (4.63% vs. 23.49%and 19.14% vs. 28.48%,respectively).

Lower cardiovascular disease risk in COPD patients with AATD
Next, we focused on the impact of AATD on the various comorbidities in COPD patients.As illustrated in Fig. 1, AATD patients show a reduced risk of developing cardiovascular diseases.In addition, our findings imply that AATD patients have a lower risk of developing other comorbidities such as diabetes Results are for all sites except for Site B und Site D, 3 differences between groups calculated for all sites except for Site E and Site F, 4 differences between groups calculated for all sites except for Site A, 5 differences between groups calculated for all sites except for Site E
The results of the generalized linear mixed effect models adjusted for gender, age and nicotine abuse (Table 4b) confirmed that plasma HbA1c (MD = -0.52,p-value = 0.001) as well as total cholesterol (MD = -11.04,p-value = 0.007) were significantly lower in AATD than in non-AATD COPD patients.The values of these laboratory parameters are consistent with the observed lower risk of cardiovascular comorbidities in AATD compared to non-AATD COPD.

Predicting hospital mortality
We used hs-troponin levels as a biomarker to predict hospital mortality in the complete COPD cohort, and sub-cohorts of AATD and Non-AATD COPD patients.As shown in Table 5, hs-troponin was a poor predictor of hospital mortality in AATD patients (OR = 9.93, p-value = 0.330) while it was significant in predicting hospital mortality for Non-AATD and total cohort of COPD patients (OR = 1.85, p-value < 0.001).

Correlation of liver biomarkers in individuals with liver diseases
We also investigated to what extent liver biomarkers correlate with liver diseases in the entire COPD cohort and the sub-cohorts of AATD and non-AATD COPD patients.Liver diseases were defined as those covered by the coding algorithms for the Elixhauser comorbidities generated within the comorbidity package in R [30], including liver cirrhosis, liver fibrosis, liver failure, hepatitis, gastric and oesophageal varices.As presented in Fig. 1 Forest plot showing the adjusted influence of AATD status on different comorbidities.Odd ratios (OR) are reported as ln (Odds), and confounders include age, gender and nicotine abuse Table 6, COPD patients -overall and for the sub-cohorts AATD and non-AATD -without liver diseases show significantly lower values for ASAT and ALAT as compared to patients with liver diseases.In contrast, COPD patients without liver disease have significantly higher values for LDL (MD = 9.79, p-value < 0.001) and total cholesterol (MD = 16.04,p-value < 0.001).This applies to the sub-cohort of AATD COPD patients as well (LDL: MD = 9.86, p-value < 0.001; total cholesterol: MD = 16.08,p-value < 0.001).

Discussion
This is the first study on AATD related COPD integrating data from several university hospitals in a distributed analysis approach utilizing the technical infrastructure of the MIRACUM data integration centres.Moreover, the  This model was fitted using pooled data from the Site A and Site E cohorts only.AATD: Alpha 1-antitrypsin deficiency, hs-troponin: high sensitivity troponin MIRACUM database allowed the reassessment of plasma cardiac hs-troponin, as a major indicator of cardiovascular events and an independent predictor for mortality in patients with COPD [31].First, we confirmed the already known relationship between AATD and an increased prevalence of bronchiectasis [17,32].On the other hand, our data revealed that AATD-related COPD patients have a much lower risk of a cardiovascular disease than non-AATD COPD patients.Accordingly, the prevalence of acute ischemic stroke, angina pectoris, arrhythmias, atrial fibrillation, carotid stenosis, coronary heart diseases, ischemic heart, liver diseases and cardiac infarction were significantly lower than in individuals with non-AATD COPD.
Our findings are also in line with data reported by Greulich et al.The authors also found lower cardiovascular comorbidity profiles in COPD patients with AATD by examining health insurance data, however, without accounting for the confounding factors like age, nicotine abuse and BMI [16].The German prospective multicentre study COSYCONET also found a significantly lower prevalence of cardiovascular disease in AATD-related COPD patients [17].Although the COSYCONET results were based on the adjustment of age, gender, BMI and pack years of cigarettes, patient selection bias cannot be totally excluded because significantly younger AATD patients were enrolled due to family screening.Unlike in the COSYCONET cohort study, we only considered patients having COPD as a primary or secondary diagnosis.Despite the above differences in study designs, previously published [16,17] and current data support the notion that AATD-related COPD patients have a lower risk for cardiovascular diseases than Non-AATD COPD patients.The question remains why?According to the COSYCONET study, lower cardiovascular comorbidities in AATD than in non-AATD COPD patients can be associated with significantly lower plasma triglyceride concentrations and lower HbA1c in COPD patients with AATD [17].In contrast to the results of COSYCONET, we could not confirm that the long-term blood sugar parameter HbA1c is reduced in AATD COPD [17].A possible explanation for these conflicting findings could be that our current study is based on the larger patient cohort, which includes older individuals with AATD than in COSYCONET.On the other hand, our findings are in line with Bhattacharjee et al. who did not find a lower HbA1c in AATD patients with hepatic insufficiency [33].
Recently, Hamesch et al. [34] showed that Pi*ZZ AATD individuals have lower serum concentrations of LDL, VLDL and triglyceride, and related this finding to hepatic damage caused by intracellular accumulation of misfolded Z-AAT protein.These authors also provided evidence that Pi*Z-overexpressing mice have reduced expression of genes involved in lipid secretion and develop liver steatosis [34].
Our results revealed that plasma levels of total cholesterol are lower in patients with AATD, but not in terms of LDL cholesterol, HDL cholesterol and triglycerides (routinely measured serum total cholesterol is a sum of lipoprotein particles with high, low, intermediate and very low density (HDL, LDL, IDL a VLDL).Out of these, VLDL particles are almost exclusively synthesized in the liver and secreted into the bloodstream.IDL and LDL particles are derived from VLDL after the loss of free fatty acids.HDL particles are also largely produced in the liver, but their production in the small intestine is also important [35].
Because liver diseases lead to a decrease in serum levels of cholesterol and triglycerides [36], this might explain why individuals with AATD -which is often associated with liver disease [37] -have lower levels of total cholesterol.In fact, lower cholesterol levels are in turn associated with a lower risk of cardiovascular diseases [38].Patients with AATD have a more vulnerable liver, indicated by the elevated liver transaminases ASAT and ALAT, which were significantly higher in patients with AATD in this study.This is congruent with the results of COSYCONET that confirmed lower levels of total cholesterol and a lower prevalence of cardiovascular comorbidities in individuals with AATD [17].Patients with AATD might secrete less total-cholesterol due to the accumulation of misfolded AAT in the hepatocytes and thus might have a lower prevalence of cardiovascular comorbidities.Contrary to expectations, the real life data showed fewer liver diseases in patients with AATD than in COPD patients without AATD.The lower prevalence of liver disease in AATD is most likely due to bias, since it is undisputed that individuals with AATD show a higher association with liver disease [37].A very likely reason could be that the liver diseases were not always recorded in the hospital bills, especially since only diagnoses that lead to increased reimbursement by the health insurance companies are coded.This also shows that the coded data should always be interpreted from the point of view of coding practice, which is a limitation of such big data studies.Another reason could also be the lack of awareness in clinical routine that AATD can also be a disease of the liver [39].For this reason, available laboratory values should also be included in the interpretation, such as the elevated transaminases ALAT and ASAT in AATD as done in our study.
Moreover, the evaluation of 'big data' in MIRACUM is also suitable for the analysis of the biomarker hs-troponin in terms of prediction of mortality in individuals with COPD.We confirmed the results of Waschki et al. from the German prospective cohort study COSYCONET, that hs-troponin is a suitable biomarker for estimating mortality in subjects with stable COPD.While Waschki et al. examined subjects with stable COPD, we were able to analyse real life data that also includes patients with exacerbated COPD.

Strengths and limitations
One strength of our study is the consideration of a big sample size from the MIRACUM consortium data integration centres to provide routinely collected data on Non-AATD and AATD COPD.The use of the DataSHIELD platform enabled the analysis of data from different cohorts of the MIRACUM consortium, hence increasing the statistical power of our analysis while preserving the privacy of patient data.Another advantage of MIRACUM was that, in contrast to COSYCONET, we had no selection bias in AAT patients.A further advantage was that important confounders for comorbidities could be taken into account.Some variables and biomarkers (e.g., hs-troponin) were not recorded across all participating cohorts with the consequence, that some results such as prevalence, effect estimates and predictions could not be generalized to our entire study population but rather to individual cohorts.In addition, despite the use of pooled data, the proportion of AATD patients was still relatively small.This is attributed to the rarity of this orphan disease and might have affected the significance as well as the precision of some differences.Another limitation is that BMI was only rarely recorded and therefore not available for the analysis.Furthermore, we were only able to adjust for a coded nicotine abuse and not the current smoking status.In addition, we could not provide any information on AAT augmentation therapy of patients with AATD, which could also have an impact on blood lipids, since this is not coded separately in Germany [40].Up to now, the data available in the MIRACUM consortium are mostly coded with reimbursement in mind, and not research.Thus, socalled upcoding (i.e., using fitting ICD-10 and OPS codes with the highest reimbursement amount, not the highest clinical relevance) and selective coding (i.e., information not relevant for reimbursement is only coded rarely) can lead to biased results.Data on exacerbations were not available because they were not always coded.Currently, we were unable to consider lung function information because these measurements were too heterogeneous from site to site and were often missing.Nevertheless, the combination of the data from six university hospitals leads to a reliable statement due to the large number of cases.

Conclusion
We conclude that the concept of MIRACUM is feasible for the analysis of a large healthcare database that provided important data, especially for orphan diseases.We have proved evidence that COPD patients with AATD have a lower cardiovascular risk.In addition, our real world data showed that hs-troponin is a predictor for hospital mortality in patients with COPD.

Table 1
Baseline characteristics of the combined study population.For 'Number of individuals' the percentage values correspond to the total number of COPD patients, for gender and nicotine abuse to the specific sub-cohort (i.e., COPD total, non-AATD, AATD, respectively).p-values are calculated using the Chi-squared test for categorical variables or t-test for comparing the mean values * Presented as mean (SD), AATD: Alpha 1-Antitrypsin Deficiency,1Results are just for five sites due to disclosure risks in the Site C cohort, 2

Table 2
Influence of Biomarkers related to systemic inflammation on hs-troponin using univariate linear regression This model was fitted using pooled data from the Site A and Site E cohorts only

Table 3
Prevalence of comorbidities in total COPD cohort, and subgroups of non-AATD and AATD COPD patients.The data are given as a percentage.All p-values are calculated with the Chisquared test for two groups

Table 4
Mean differences of laboratory parameters in all COPD patients.The confounders for the adjusted mean differences include age, gender and nicotine abuse.Unadjusted mean differences for AATD vs. Non-AATD COPD patients in laboratory parameters + The mean difference for AATD (n = 55) is relative to Non-AATD (n = 4365) and the p-value is that of the student's t-test comparison.CI: Confidence Interval, U/L: Units per litre, mg/dl: Milligrams per decilitre, mmol/mol: Millimoles per mole, AATD: Alpha 1-Antitrypsin Deficiency, ASAT: Aspartate Aminotransferase, ALAT: Alanine Aminotransferase, HbA1c: Glycosylated haemoglobin, HDL: High Density Lipoprotein, LDL: Low Density Lipoprotein *The difference here includes all sites except Site F cohort (n.a is due to DataSHIELD disclosure controls)

Table 5
Table assessing the influence of hs-troponin on predicting hospital mortality

Table 6
Adjusted mean effects of liver diseases on liver transaminases, LDL cholesterol and total cholesterol.Confounding factors include age, gender and nicotine abuse